Measuring the Complexity of Join Enumeration in Query Optimization

Authors

  • Kiyoshi Ono
  • Guy M. Lohman
Abstract

Since relational database management systems typically support only dyadic join operators as primitive operations, a query optimizer must choose the “best” sequence of two-way joins to achieve the N-way join of tables requested by a query. The computational complexity of this optimization process is dominated by the number of such possible sequences that must be evaluated by the optimizer. This paper describes and measures the performance of the Starburst join enumerator, which can parametrically adjust for each query the space of join sequences that are evaluated by the optimizer to allow or disallow (1) composite tables (i.e., tables that are themselves the result of a join) as the inner operand of a join and (2) joins between two tables having no join predicate linking them (i.e., Cartesian products). To limit the size of their optimizer’s search space, most earlier systems excluded both of these types of plans, which can execute significantly faster for some queries. By experimentally varying the parameters of the Starburst join enumerator, we have validated analytic formulas for the number of join sequences under a variety of conditions, and have proven their dependence upon the “shape” of the query. Specifically, “linear” queries, in which tables are connected by binary predicates in a straight line, can be optimized in polynomial time. Hence the dynamic programming techniques of System R and R* can still be used to optimize linear queries of as many as 100 tables in a reasonable amount of time!

A query optimizer in a relational DBMS translates non-procedural queries into a procedural plan for execution, typically by generating many alternative plans, estimating the execution cost of each, and choosing the plan having the lowest estimated cost. Increasing this set of feasible plans that it evaluates improves the chances (but does not guarantee!) that it will find a better plan, while increasing the (compile-time) cost for it to optimize the query. A major challenge in the design of a query optimizer is to ensure that the set of feasible plans contains efficient plans without making the set too big to be generated practically.
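
To make the effect of query shape and of the two enumerator parameters concrete, here is a minimal Python sketch. It is not the Starburst enumerator: `chain_edges`, `count_joins`, and both flags are hypothetical names chosen for this illustration, and the code is a brute-force counter that is itself exponential in the number of tables, so it is only meant for small N. It builds table sets bottom-up, as a dynamic programming enumerator would, and counts each unordered pair of operands once.

```python
from itertools import combinations


def chain_edges(n):
    """Join predicates of a linear query: table i is linked to table i + 1."""
    return {(i, i + 1) for i in range(n - 1)}


def count_joins(n, edges, allow_composite_inner=True, allow_cartesian=False):
    """Count the distinct two-way joins considered for an n-table query.

    Assumptions of this sketch (not taken from the paper's implementation):
    each unordered pair of operands counts once, and a table set is feasible
    only if at least one permitted join produces it.
    """

    def linked(a, b):
        # True if some join predicate connects a table in a with one in b.
        return any((x in a and y in b) or (x in b and y in a) for x, y in edges)

    feasible = {frozenset([t]) for t in range(n)}  # single tables
    joins = 0
    for size in range(2, n + 1):  # smaller sets are finished before larger ones
        for combo in combinations(range(n), size):
            whole = frozenset(combo)
            first, rest = combo[0], combo[1:]
            produced = False
            # Keep the smallest table on the left so each split is seen once.
            for k in range(len(rest)):
                for extra in combinations(rest, k):
                    left = frozenset((first,) + extra)
                    right = whole - left
                    if left not in feasible or right not in feasible:
                        continue
                    if (not allow_composite_inner
                            and len(left) > 1 and len(right) > 1):
                        continue  # one operand must be a single table
                    if not allow_cartesian and not linked(left, right):
                        continue  # no predicate links the operands
                    joins += 1
                    produced = True
            if produced:
                feasible.add(whole)
    return joins


if __name__ == "__main__":
    for n in (4, 6, 8):
        edges = chain_edges(n)
        print(n,
              count_joins(n, edges),
              count_joins(n, edges, allow_composite_inner=False),
              count_joins(n, edges, allow_cartesian=True))
```

For a linear query of 4 tables this counts 10 joins when composite inners are allowed and 9 when they are not, and 25 once Cartesian products are also allowed. For small N the two predicate-only counts match the closed forms (N^3 - N)/6 and (N - 1)^2; these were worked out for this sketch rather than quoted from the paper, but they illustrate the polynomial growth that the abstract attributes to linear queries.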

Similar resources

Graceful Degradation for Top-Down Join Enumeration via similar sub-queries measure on Chip Multi-Processor

Most contemporary database systems' query optimizers exploit System-R’s dynamic programming method (DP) to find the optimal query execution plan (QEP) without evaluating redundant sub-plans. However, in the relational database setting today, large queries containing many joins are becoming increasingly common. Based on this trend, it has become tempting to improve the DP performance. Chip Multi-P...

Parallelizing query optimization

Many commercial RDBMSs employ cost-based query optimization exploiting dynamic programming (DP) to efficiently generate the optimal query execution plan. However, optimization time increases rapidly for queries joining more than 10 tables. Randomized or heuristic search algorithms reduce query optimization time for large join queries by considering fewer plans, sacrificing plan optimality. Thou...

Algorithms for Efficient Top-Down Join Enumeration

For a DBMS that provides support for a declarative query language like SQL, the query optimizer is a crucial piece of software. The declarative nature of a query allows it to be translated into many equivalent evaluation plans. The process of choosing a suitable plan from all alternatives is known as query optimization. The basis of this choice are a cost model and statistics over the data. Ess...

The Complexity of Transformation-Based Join Enumeration

Query optimizers that explore a search space exhaustively using transformation rules usually apply all possible rules on each alternative, and stop when no new information is produced. A memoizing structure was proposed in [McK93] to improve the re-use of common subexpression, thus improving the efficiency of the search considerably. However, a question that remained open is, what is the comple...

Counter Strike: Generic Top-Down Join Enumeration for Hypergraphs

Finding the optimal execution order of join operations is a crucial task of today’s cost-based query optimizers. There are two approaches to identify the best plan: bottom-up and top-down join enumeration. But only the top-down approach allows for branch-and-bound pruning, which can improve compile time by several orders of magnitude while still preserving optimality. For both optimization strat...

Publication date: 1990